Impact Evaluation of Reading i-Ready Instruction for Elementary Grades
using 2018-19 Data


Abstract 


Curriculum Associates’ i-Ready® Instruction is a supplemental, online personalized instruction
program available for reading and mathematics. The Human Resources Research
Organization (HumRRO), in collaboration with Century Analytics, implemented a
quasi-experimental design (QED) using 2018-19 i-Ready Diagnostic and Instruction data to evaluate
the impact of Curriculum Associates’ reading i-Ready Instruction on student reading
achievement at grades K-5. We hypothesized that student achievement, as measured by the
i-Ready® Diagnostic, would be higher for students using i-Ready Instruction for reading than for a
comparison group of students who did not use this instruction. We conducted matching to
identify a set of comparison students demographically similar to our i-Ready Instruction
treatment students for each grade level. First, we stratified our sample by gender, English
learner status, disability status, and economic disadvantage status. Next, we used propensity
score matching to identify analytic samples of i-Ready Instruction and comparison students
matched on baseline reading achievement. Students who received i-Ready
Instruction and students in the comparison group were administered the reading i-Ready
Diagnostic assessments. To evaluate impact, we conducted hierarchical linear modeling (HLM)
separately for each analytic sample, with students at level 1 and schools at level 2.
Results suggest students in grades K-5 who used i-Ready Instruction with fidelity performed
statistically significantly better in reading than students who did not use this
instruction. The effect sizes fall within or exceed (in the case of kindergarten) the range
that Kraft (2019) found to be typical of education interventions. These
findings provide support that, when used with fidelity, student use of i-Ready Instruction for
reading is tied to higher student reading achievement.


Introduction 


Founded in 1969, Curriculum Associates provides a variety of educational products and 
services with the goal of improving education for students and teachers. Two Curriculum 
Associates products include i-Ready® Diagnostic (available for K-12) and i-Ready® Instruction 
(available for K-8). The i-Ready Diagnostic assessments (a) are online, computer-adaptive
assessments that pinpoint student needs at the sub-skill level and (b) help monitor the extent to
which students are on track to achieve end-of-year targets. The i-Ready Diagnostic
assessments are independent measures often used by educators as classroom benchmark
assessments. They can be used with or without i-Ready Instruction. We provide additional
information on the validity and reliability of the i-Ready Diagnostic as a measure of student 
achievement in our methodology discussion below. i-Ready Instruction is a supplemental 
program that provides online, individualized instruction adjusted to student needs. 


The Human Resources Research Organization (HumRRO) is an independent research 
organization that specializes in program evaluation and quantitative methodology. Century 
Analytics is a small business with various education research expertise including quasi- 
experimental design and What Works Clearinghouse (WWC) standards. 


1 https://www.curriculumassociates.com/products/i-ready


HumRRO and Century Analytics conducted an evaluation to examine the impact of i-Ready
Instruction on reading achievement for students in elementary grades K-5 using 2018-19 data.
This was one in a series of evaluations examining the impact of Curriculum Associates’ 
interventions on student achievement. This study was designed to meet the required rigor of the 
WWC 4.0 standards to achieve a rating of Meets WWC Group Design Standards with 
Reservations (WWC, 2017a), and to meet guidelines for a Level 2 (or Moderate) rating for the 
Every Student Succeeds Act (ESSA) guidance for evidence-based research (U.S. Department 
of Education, 2016). To accomplish this, we used a quasi-experimental design (QED), 
established baseline equivalence between the treatment and comparison groups, included 
baseline achievement as a covariate, and used a sampling design that mitigates the effects of 
any confounding factors. 


There were key differences between this study and past studies. Specifically, previous studies
considered the school as the unit of i-Ready Instruction assignment, whereas this study considered
the student as the unit of assignment. This change in unit of assignment acknowledges the inherent
flexibility of i-Ready Instruction implementation. For example, some schools may implement at
the school level, the grade level, or the classroom level, while other schools may implement
i-Ready Instruction at the individual student level so they can target specific groups of students.
In addition, our past studies included only schools using i-Ready Diagnostic and Instruction, or
i-Ready Diagnostic only for the comparison group, with general education students. Thus, those
schools using i-Ready Diagnostic (with or without Instruction) with select subsets of students
were removed from our sample. Because our data support various types of implementation
occurring across schools, and we understand it is Curriculum Associates’ intent that these
different implementations are valid uses, this study includes students from schools that are
implementing i-Ready Diagnostic with or without Instruction in a variety of ways.


Defining i-Ready Instruction 


The impact of i-Ready Instruction on student achievement was the focus of this evaluation.
i-Ready Instruction is an online personalized instruction program aligned to college- and
career-ready standards that incorporates engaging multimedia instruction and progress monitoring
into online lessons. Lessons are intended to provide a consistent best-practice lesson structure and
build students’ conceptual understanding. i-Ready Instruction is intended to be used in
conjunction with i-Ready Diagnostic, which monitors student progress and identifies student
performance in reading. This diagnostic information helps target student-specific intervention,
which can be provided through i-Ready Instruction.


Curriculum Associates developed a Theory of Action (TOA) that features the key 
implementation components of i-Ready Instruction, the intended intermediate outcomes, and 
the intended long-term outcomes. The key implementation components highlight actions
recommended for students, teachers, and leaders to obtain the long-term outcome of improved
student learning in reading and mathematics. Among others, the key components include
support at the school and district leadership levels, monitoring of student progress by teachers,
and student use of i-Ready Instruction to work through a personalized, scaffolded instruction
path. The i-Ready Instruction TOA is provided in Appendix A.


Curriculum Associates provides guidance to districts and schools on how to implement i-Ready
Instruction to best benefit student learning (Curriculum Associates, 2019). Guidance indicates
students achieve greater gains when using i-Ready Instruction for an average of at least 30
minutes per week, per subject area. In addition, Curriculum Associates recommends use for 12
to 18 calendar weeks between two administrations of the i-Ready Diagnostic (Curriculum
Associates, 2018).


Research Questions 


The purpose of this study was to determine the impact of i-Ready Instruction on student 
achievement in reading. We examined the following key research question separately for each 
grade K—5 of our study: 


Do students who use i-Ready Instruction for reading have higher reading achievement
as measured by the i-Ready Diagnostic than students who use i-Ready Diagnostic only?


We hypothesized that student achievement for reading would be higher for students who used
i-Ready Instruction with fidelity, based on the criteria described in the TOA and user guidance
(Curriculum Associates, 2019). Our hypothesis was based on the belief that students benefit 
from the i-Ready Instruction targeted to their specific needs in reading. 


Methodology 


In this section, we describe the methodology for conducting our impact analysis. We begin with 
initial design decisions. We then discuss the student selection and matching process as well as 
our analytic model and examination of baseline equivalence. Finally, we discuss our impact 
analysis results. 


Initial Design Decisions 
Cluster-Level Design 


We used the student as the unit of assignment for this study to acknowledge the flexibility
intended by i-Ready Instruction and to include students from schools with various
implementation types. Matching was conducted at the student level and, thus, the analytic
model examined the outcome at the student level. However, we also considered the potential
influence of school-level factors and thus decided to use a two-level analytic model with
school characteristics at level 2 and students at level 1.


Baseline and Outcome Measure 


We selected the i-Ready Diagnostic as both the baseline and outcome measure for all students
participating in this study (i.e., i-Ready Instruction students and comparison group students).
The i-Ready Diagnostic for reading measures achievement aligned to common reading content and
skills with demonstrated test score reliability. Marginal reliabilities range from 0.91 to 0.97 and
test-retest reliabilities range from 0.70 to 0.86 for reading through grade 5. Therefore, this
assessment meets the WWC 4.0 standards for an acceptable baseline and outcome measure
(WWC, 2017a).


The i-Ready Diagnostic assessments align to college- and career-ready standards so that 
results can inform student placement decisions, offer explicit instructional advice, and prescribe 
resources for targeted instruction and intervention. The assessments are used by some schools 
and districts in conjunction with i-Ready Instruction and by others as a stand-alone diagnostic 
assessment without the use of i-Ready Instruction. The i-Ready Diagnostic assessments for
mathematics and reading are currently used by more than 6.5 million students across the United
States. Thus, the use of i-Ready Diagnostic as the outcome measure allowed us to include a
large sample of students from across the United States. The i-Ready Diagnostic is intended to
be administered in a standardized manner across schools (Curriculum Associates, 2019b).
Specifically, teachers are to schedule the first (fall) Diagnostic 2 to 3 weeks into the school year in
two 45- to 50-minute sessions. Teachers also are encouraged to test technology to ensure proper
function and have pencils and paper available as scratch paper. Test administrators provide 
instructions to their students and motivate them to do their best. Teachers monitor students as 
they complete the assessments. 


Multiple studies have been conducted to support the reliability and validity of the reading
i-Ready Diagnostic as well as its consistency with education standards used across the United
States. Since being released in summer 2011, i-Ready Diagnostic has been reviewed and
approved at the state level as an assessment, instructional resource, or intervention in Arizona, 
California, Colorado, Connecticut, Delaware, Florida, Georgia, Idaho, Indiana, Massachusetts, 
Mississippi, Nevada, New Mexico, New York, North Carolina, Ohio, Oklahoma, Oregon, 
Tennessee, Utah, and Virginia. 


Curriculum Associates has conducted multiple linking studies examining i-Ready Diagnostic
scores for reading at grades 3-8 that provide evidence the i-Ready Diagnostic measures skills
consistent with student expectations and can be used as a student reading achievement
measure. For example, a study using 2016 data examined the correlation between i-Ready
Diagnostic and the Smarter Balanced summative assessments, the Partnership for Assessment
of Readiness for College and Careers (PARCC), and state testing programs in Florida, Georgia,
Indiana, Michigan, Mississippi, New York, North Carolina, Ohio, and Tennessee. These studies
show strong correlations between i-Ready Diagnostic scores and scores on these national and
state tests. The average correlations across grades between the i-Ready Diagnostic for reading
and these assessments ranged from 0.78 (Tennessee TNReady) to 0.85 (Smarter Balanced). These
studies also provide evidence that the i-Ready Diagnostic content is highly consistent with what
students across the United States are expected to learn (Curriculum Associates, 2019). Curriculum
Associates recently completed linking studies for Colorado, Kentucky, and Missouri. In addition, 
Curriculum Associates has commissioned Odell Education and others to complete alignment 
studies to demonstrate the degree of alignment between the content on i-Ready Diagnostic and 
current sets of state standards. Specifically, they have conducted alignment studies for the 
Common Core State Standards (CCSS), and for the Florida, Indiana, Louisiana, Michigan, Ohio, 
and South Carolina state standards. 


Required Number of Students 


We conducted power analyses using Optimal Design software (Spybrook et al., 2011) to identify 
the total number of students required at each grade level to reject the null hypothesis that there 
is no difference in student reading achievement between the treatment and comparison group. 
Statistical power is influenced by various factors. We used data from previous studies HumRRO 
conducted using i-Ready Diagnostic as an outcome to estimate conservative and optimistic
parameters for use in the power analysis. These parameters were: (a) 0.90 for the relationship 
between the baseline and outcome variable, and (b) 0.10 and 0.30 for the intraclass correlation 
coefficient (ICC). Results of the power analyses indicated sample sizes of a minimum of 400 
students would be sufficient to reach our desired statistical power of 0.80. This level of statistical 
power provides an 80% chance of detecting a statistically significant difference with 95% 
confidence, if one exists. Our student samples across all grades far exceeded the minimum. 
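
As a rough illustration of what a power target of 0.80 implies, the sketch below computes an approximate minimum detectable effect size (MDES) for a simplified two-group comparison. It is not the Optimal Design calculation described above: it ignores the clustering of students within schools (and therefore the ICC), assumes equal group sizes, and treats the 0.90 baseline-outcome relationship as a correlation. The function name and these simplifications are assumptions made only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_mdes(n_total, r_pre_post, alpha=0.05, power=0.80):
    """Approximate MDES (in SD units) for a two-group comparison.

    Simplifying assumptions (not from the report): equal group sizes,
    no clustering, and a baseline covariate correlated r_pre_post with
    the outcome.
    """
    z = NormalDist()
    # Two-sided test at alpha plus the power requirement (about 2.80).
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Standard error of a standardized mean difference with n_total / 2
    # students per group, reduced by the variance the covariate explains.
    se = sqrt((1 - r_pre_post ** 2) * 4 / n_total)
    return multiplier * se


mdes_400 = approx_mdes(n_total=400, r_pre_post=0.90)
print(round(mdes_400, 2))
```

Under these simplified assumptions, larger samples shrink the detectable effect; accounting for school-level clustering, as the actual power analysis did, would generally raise the required sample size.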

Analytic Model


Our model for the impact analyses incorporated student- and school-variables. The baseline 
difference model used to estimate baseline equivalence for our matched sample was based on 
the impact model. As previously discussed, we chose a two-level model with level 1 as the 
student and level 2 as school. 


Impact Model 


We used HLM to estimate the impact of i-Ready Instruction on student achievement. We 
included the following student-level covariates in each analysis: 


• Group membership (0 = comparison; 1 = treatment)
• i-Ready Diagnostic reading baseline performance (grand mean centered)
• Blocking variables (i.e., dummy codes) to account for strata used in matching (described
  in the matching section of this report)


Although we considered the student to be our unit of assignment, with the understanding that 
many schools intentionally do not use i-Ready Instruction with all students, we also wanted to
capture and control for potential school-level factors. We were especially interested in 
identifying variables that would provide unique information from the student-level variables. We 
used the following school-level covariates in each analysis: 


• Traditional school indicator (0 = K-5 structure; 1 = other)
• Location (town, suburban, rural, city)
• Charter/magnet school indicator (0 = not charter or magnet; 1 = charter or magnet)
• Percent white students
• Percent of students eligible for free and reduced price lunch (FRL)


Our Level 1 model described the relationship between student outcomes, student-level 
characteristics, the baseline covariate, and the strata used for matching. This model level also 
included the treatment indicator. We specified level 1 of the model as follows: 


$$Y_{ij} = \beta_{0j} + \beta_{1j}(GROUP_{ij}) + \beta_{2j}(PRE_{ij} - \overline{PRE}_{..}) + \sum_{q} \beta_{qj}(STRATA_{qij}) + e_{ij}$$


Where $Y_{ij}$ is the outcome for student i in school j. $\beta_{0j}$ is the adjusted mean outcome for
comparison students in school j. $\beta_{1j}$ is the adjusted mean difference in outcome due to the
student’s group membership (i.e., the treatment effect), and GROUP is an indicator variable
coded 1 for students in the i-Ready Instruction group and 0 for students in the comparison
group. $\beta_{2j}$ is the adjusted difference in outcome due to the student’s baseline achievement
score (grand mean centered). $\beta_{qj}$ is a vector of blocking variables to account for the strata used
in matching. $e_{ij}$ is the random error in the achievement outcome associated with student i in
school j not accounted for in the model.


We specified level 2 of the model as follows: 


$$\beta_{0j} = \gamma_{00} + \gamma_{01}(STRUCTURE_{j}) + \gamma_{02}(CHARTER_{j}) + \gamma_{03}(PERWHITE_{j}) + \gamma_{04}(PERFRL_{j}) + \sum_{k} \gamma_{0k}(LOCATION_{kj}) + u_{0j}$$
$$\beta_{1j} = \gamma_{10}$$
$$\beta_{2j} = \gamma_{20}$$
$$\beta_{pj} = \gamma_{p0}$$
$$\beta_{qj} = \gamma_{q0}$$


Where $\gamma_{00}$ is the grand mean. $\gamma_{01}$ is added to control for school grade-level structure where
STRUCTURE is coded as 0 for schools with a typical grade level structure (K-5 for elementary
school) and 1 for schools with an atypical grade structure. $\gamma_{02}$ is the additive effect for charter or
magnet schools. $\gamma_{03}$ and $\gamma_{04}$ are added to control for school characteristics of percent white
and percent FRL, respectively. $\gamma_{0k}$ is a vector of three dummy variables to control for school
location. $u_{0j}$ is the random error in the achievement outcome associated with school j. The
regression slopes for the treatment, student baseline achievement, student demographics and
strata are fixed across schools.
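
Substituting the level 2 equations into level 1 gives a single combined (mixed) form of the impact model. The restatement below is a sketch based on the two levels as specified above; the subscript conventions (for example, indexing the location dummy variables by k and the strata by q) are assumptions where the original notation is ambiguous.

```latex
\begin{aligned}
Y_{ij} ={}& \gamma_{00} + \gamma_{01}\,STRUCTURE_{j} + \gamma_{02}\,CHARTER_{j}
          + \gamma_{03}\,PERWHITE_{j} + \gamma_{04}\,PERFRL_{j}
          + \sum_{k}\gamma_{0k}\,LOCATION_{kj} \\
        & + \gamma_{10}\,GROUP_{ij}
          + \gamma_{20}\,(PRE_{ij} - \overline{PRE}_{..})
          + \sum_{q}\gamma_{q0}\,STRATA_{qij}
          + u_{0j} + e_{ij}
\end{aligned}
```

In this form, $\gamma_{10}$ is the treatment effect, and $u_{0j}$ and $e_{ij}$ are the school-level and student-level residuals, respectively. Only the intercept varies across schools.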


Baseline Difference Model 


We used the model below to estimate the baseline difference between students in the treatment 
group and the comparison group. This model follows the same structure as the impact analysis 
model but excludes covariates. 


We specified level 1 of the model as: 
$$Y_{ij} = \beta_{0j} + \beta_{1j}(GROUP_{ij}) + \sum_{q} \beta_{qj}(STRATA_{qij}) + e_{ij}$$


Where $Y_{ij}$ is the baseline for student i in school j. $\beta_{0j}$ is the adjusted mean outcome for
comparison students in school j. $\beta_{1j}$ is the adjusted difference in outcome due to the student’s
study group membership (i.e., the baseline difference), and GROUP is an indicator variable
coded 1 for students in the i-Ready Instruction group and 0 for students in the comparison
group. $\beta_{qj}$ is a vector of blocking variables to account for the strata used in matching. $e_{ij}$ is the
random error in the achievement outcome associated with student i in school j not accounted for
in the model.


We specified level 2 of the model as:
$$\beta_{0j} = \gamma_{00} + u_{0j}$$
$$\beta_{1j} = \gamma_{10}$$
$$\beta_{qj} = \gamma_{q0}$$


Identifying a Student Sample 
Defining Eligibility 


For each grade, we started with a student-level i-Ready usage file of reading i-Ready Diagnostic
and i-Ready Instruction use in 2018-19 for students who had, at a minimum, fall and spring
i-Ready Diagnostic scores. We next filtered to include only public school students, which included
traditional public schools and public charter and magnet schools. This ensured we were
including only students in a relatively traditional school environment with expectations to follow
state-adopted college- and career-ready standards.


We also filtered our sample based on the availability of student-level demographic variables that
were identified for inclusion in matching and the impact analysis model. Only students with
available demographic data for (a) gender, (b) English learner (EL) status, (c) special education
status, and (d) economic disadvantage status were included. Before removing schools, we
conducted data checks, which indicated that students with available demographic data did not
differ in academic achievement, as measured by the i-Ready Diagnostic, from those who did not have
demographic data. These checks provided assurance that data were missing at random.
However, we also note that users of the i-Ready products tend to come from schools with higher
percentages of minority and low-income students than United States schools overall; thus,
although we were confident our student sample used for matching was academically representative
of the public school students using i-Ready Diagnostic or i-Ready Diagnostic and Instruction, we
do not expect it to be representative of all students in the United States.


In addition, for a student to be eligible for the treatment group, they must have used i-Ready
Instruction for reading a minimum of 18 distinct weeks for an average of at least 30 minutes per
week (Curriculum Associates, 2018). This was consistent with guidance on the minimum
i-Ready Instruction usage at the student level for attaining intended goals of improved student
reading achievement. These students also needed to have attended a school that began using
i-Ready Instruction to some extent prior to the 2018-19 school year. This requirement is based
on the understanding that i-Ready Instruction implementation requires start-up time to learn
the technology and adjust scheduling before i-Ready Instruction is fully up and running.
To be eligible for the comparison group, students must not have used any i-Ready Instruction
for reading in 2018-19. We removed students not meeting the treatment or comparison
eligibility requirements from the datafile used in matching.
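
The eligibility rules above can be sketched as a small classification function. This is a minimal illustration rather than the study's actual processing code: the function and argument names are hypothetical, and computing the weekly average over the weeks with any use is an assumption, since the averaging window is not specified here.

```python
def assign_group(weeks_used, total_minutes, school_used_before_2018_19):
    """Return 'treatment', 'comparison', or None (excluded from matching).

    Argument names are hypothetical. Average minutes per week is computed
    over the distinct weeks with any use, which is an assumption.
    """
    if weeks_used == 0 and total_minutes == 0:
        # No i-Ready Instruction use for reading in 2018-19.
        return "comparison"
    if weeks_used >= 18 and school_used_before_2018_19:
        if total_minutes / weeks_used >= 30:
            return "treatment"
    # Some use, but below the usage criteria (or a first-year school).
    return None


print(assign_group(20, 700, True))   # 35 minutes per week on average
print(assign_group(0, 0, False))     # no Instruction use
print(assign_group(10, 400, True))   # too few weeks
```

Students for whom the function returns None correspond to those removed from the datafile before matching.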


Matching 


We conducted matching at the student level using a multi-step process. Matching was 
conducted separately by grade (K—5). Thus, we conducted each matching step six separate 
times to identify six analytic samples (i.e., six grades). 


First, we stratified our sample by gender, EL status, special education status, and economic
disadvantage status. This ensured that students were matched only to students with identical
demographic characteristics on these four variables. The variables were selected because they
are known to be related to student achievement (Hanover Research, 2014; van Langen, Bosker,
& Dekkers, 2006) and were available through the i-Ready usage datafiles. This stratification
resulted in 16 strata at each grade. Each stratum contained treatment and comparison students.
In some strata, the treatment group was larger than the comparison group, or vice versa. Within
each stratum, we used logistic regression to compute a propensity score for each student (Guo
& Fraser, 2010). The propensity scores, which range between 0 and 1, predicted the chance a
student belonged to the smaller of the two groups (treatment or comparison) based on the fall
i-Ready Diagnostic scores. We used the propensity scores
to match each student from the smaller group (treatment or comparison) to a student from the
larger group. We matched using the nearest neighbor method without replacement (Stuart,
2010). Once matching was conducted for all strata within a grade, we combined the data from
all strata into one analytic sample.
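
The matching step within a single stratum can be sketched as follows. This is a minimal illustration, assuming propensity scores have already been estimated; the greedy processing order, the absence of a caliper, and all names and scores are assumptions of the sketch rather than details taken from this report.

```python
from itertools import product

# Four binary stratification variables yield 2**4 = 16 strata per grade:
# (gender, EL status, special education status, economic disadvantage).
STRATA = list(product([0, 1], repeat=4))


def nearest_neighbor_match(smaller_group, larger_group):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Each argument is a list of (student_id, propensity_score) tuples from
    one stratum. Processing order and the lack of a caliper are
    assumptions of this sketch.
    """
    available = dict(larger_group)  # candidates not yet matched
    pairs = []
    for student_id, score in smaller_group:
        if not available:
            break
        best_id = min(available, key=lambda cid: abs(available[cid] - score))
        pairs.append((student_id, best_id))
        del available[best_id]  # without replacement: use each candidate once
    return pairs


# Hypothetical propensity scores for one stratum.
treatment = [("t1", 0.30), ("t2", 0.62)]
comparison = [("c1", 0.10), ("c2", 0.33), ("c3", 0.60), ("c4", 0.90)]
pairs = nearest_neighbor_match(treatment, comparison)
print(len(STRATA), pairs)
```

Because candidates are removed once matched, the result of a greedy approach can depend on the order in which students in the smaller group are processed.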


Following specification of our analytic and baseline difference models, we removed an average 
of 3.5% of students across the six analytic samples who had incomplete data on the school- 
level variables included in the impact model. This resulted in unequal numbers of students in 
comparison and treatment groups. Figure 1 summarizes the demographic makeup of the final 
set of students in each analytic sample. The counts of students included in each group can be 
found in Table 1. As shown, the stratification process used in matching ensured the
i-Ready Instruction and comparison groups were highly similar on the key demographic
variables, despite the need to remove a small percentage of the sample to account for missing
school-level variables.


Although our sampling focused on the student level, to gain additional understanding of where
our student sample was from, we examined the distribution of students across urbanicity
categories, as defined through school-level variables of the National Center for Education
Statistics (NCES) publicly available database. Figure 2 shows that schools in the i-Ready
Instruction and comparison groups share a relatively similar urbanicity distribution.


Figure 2. Students’ school urbanicity for final matched i-Ready Instruction and
comparison samples


Baseline Equivalence 


Once our analytic samples were identified, we used our baseline difference model to estimate 
the adjusted mean differences between our i-Ready Instruction and comparison groups of 
students at each grade level. We converted the estimated baseline difference between students 
in the two groups to an effect size to evaluate baseline equivalence for each of the six analytic 
samples. For all six samples, Hedges’ g was much smaller than the WWC required threshold of 
0.25 (see Table 1), so we determined the groups were baseline equivalent (WWC, 2017b). 


Table 1. Reading Baseline Equivalence Statistics for i-Ready Instruction (Treatment) and
Comparison Groups by Grade


Grade  Group                 Students    Mean     SD  Adj Mean Diff  Effect Size
K      i-Ready Instruction      2,982  352.46  32.76           0.50         0.01
       Comparison               2,931  351.96  37.68
1      i-Ready Instruction      8,681  408.20  42.49           1.60         0.04
       Comparison               8,583  406.60  43.27
2      i-Ready Instruction     11,344  463.85  49.98           0.04        <0.01
       Comparison              11,282  463.81  50.15
3      i-Ready Instruction     14,048  500.03  49.76          -0.78        -0.02
       Comparison              14,026  500.81  50.07
4      i-Ready Instruction     15,775  528.60  50.49          -3.21        -0.06
       Comparison              15,659  531.81  50.49
5      i-Ready Instruction     17,116  553.76  50.33          -2.49        -0.05
       Comparison              16,812  556.25  50.69

Notes: SD = standard deviation of i-Ready scores, Adj Mean Diff = adjusted mean difference
between i-Ready Instruction and comparison groups, and Effect Size = Hedges’ g.
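
The effect sizes in Table 1 can be approximately reproduced from the reported values. The sketch below assumes the common WWC-style computation of Hedges' g (adjusted mean difference divided by the unadjusted pooled standard deviation, with a small-sample correction); the exact formula used for this report is not stated here, so treat the function as an illustration.

```python
from math import sqrt


def hedges_g(adj_mean_diff, n_t, sd_t, n_c, sd_c):
    """Adjusted mean difference divided by the pooled SD, with the
    small-sample correction 1 - 3 / (4N - 9)."""
    pooled_sd = sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                     / (n_t + n_c - 2))
    correction = 1 - 3 / (4 * (n_t + n_c) - 9)
    return correction * adj_mean_diff / pooled_sd


# Kindergarten and grade 4 values from Table 1.
g_k = hedges_g(0.50, 2982, 32.76, 2931, 37.68)
g_4 = hedges_g(-3.21, 15775, 50.49, 15659, 50.49)
print(round(g_k, 2), round(g_4, 2))  # 0.01 -0.06
```

Both values agree with the table after rounding and fall well inside the 0.25 threshold for baseline equivalence.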


Impact Analysis Results 


After confirming our matched samples were baseline equivalent at each grade, we estimated
the impact of i-Ready Instruction on student achievement using the analytic model described
above with spring 2019 i-Ready Diagnostic scores as the outcome. Analyses were conducted
separately for each grade. This section describes the results of the analysis. Full information on
the model results, including student- and school-level covariate parameters, is presented in
Appendix B.


In addition to estimating the impact of i-Ready Instruction, we also examined three model 
assumptions associated with two-level HLM—residual normality, independence, and 
homoscedasticity—using the MIXED_DX macro in SAS (Bell, Smiley, Ene, & Blue, 2014). No 
major violations were found. Additional details regarding the assumption checks are available in 
Appendix C. 


Table 2 contains the impact model results by grade for reading spring i-Ready Diagnostic
scores. For all grade levels, the adjusted mean differences were positive, indicating the i-Ready
Instruction group earned higher scores than the matched comparison group. All mean
differences were statistically significant (α = .05) with Hedges’ g effect sizes ranging from 0.04
to 0.20. These effect sizes are promising for an education intervention. Lipsey et al. (2012)
suggested an effect size of 0.25 is large for an education intervention, and those of 0.15 or
higher could be considered modest. Thus, the effect for Kindergarten would be considered
modest by this standard. Kraft (2019) notes traditional guidelines, including those reported by
Lipsey, are often too rigid for the realities of education interventions. He specifies effect size
ranges of 0.03-0.17 are typical of education interventions and that these often represent a
meaningful effect. He suggests effect sizes should be considered in conjunction with all aspects
of an intervention, including the magnitude of the treatment contrast and costs.


Table 2 also provides the intraclass correlations (ICCs) by grade. The ICCs measure the
proportion of the variance between schools, that is, how much of the variance in reading
i-Ready Diagnostic scores can be explained by school-level differences. The ICCs range from
0.18 (grade 5) to 0.24 (Kindergarten). This suggests the majority of variance is due to factors
other than school-level differences; however, we prefer ICCs to be below .20, and this was not
the case for four of the six grades examined. The slightly elevated ICCs may be affected by the
variation in implementation methods and our decision to model implementation at the student
level. This finding will assist future efforts to identify the most appropriate unit of
assignment to account for these variations.
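
For a two-level random-intercept model, the ICC is conventionally the between-school variance divided by the total (between-school plus within-school) variance. The sketch below illustrates that definition; the variance components are hypothetical values chosen to yield an ICC of 0.24 and are not estimates from this study.

```python
def intraclass_correlation(between_school_var, within_school_var):
    """Share of total outcome variance that lies between schools."""
    return between_school_var / (between_school_var + within_school_var)


# Hypothetical variance components (not estimates from this study).
icc = intraclass_correlation(between_school_var=480.0, within_school_var=1520.0)
print(round(icc, 2))  # 0.24
```

An ICC of 0.24 would mean roughly a quarter of the variance in scores is attributable to differences between schools, with the remainder lying among students within schools.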


Table 2. Impact Analysis Results for i-Ready Instruction (Treatment) and Comparison Groups for Reading Student
Achievement by Grade


Grade  Group                  ICC  Students    Mean     SD  Adj Mean Diff (SE)       p  Effect Size
K      i-Ready Instruction   0.24     2,982  417.81  40.97         8.73 (1.61)  <.0001         0.20
       Comparison                     2,931  409.08  45.99
1      i-Ready Instruction   0.20     8,681  464.42  47.31         4.87 (0.89)  <.0001         0.10
       Comparison                     8,583  459.55  48.23
2      i-Ready Instruction   0.19    11,344  508.05  48.24         1.92 (0.67)  0.0046         0.04
       Comparison                    11,282  506.14  47.98
3      i-Ready Instruction   0.20    14,048  536.74  49.46         4.36 (0.61)  <.0001         0.09
       Comparison                    14,026  532.38  49.91
4      i-Ready Instruction   0.20    15,775  559.19  48.92         3.47 (0.58)  <.0001         0.07
       Comparison                    15,659  555.72  50.98
5      i-Ready Instruction   0.18    17,116  581.79  50.10         4.95 (0.56)  <.0001         0.10
       Comparison                    16,812  576.85  52.19


Notes: ICC = intraclass correlation, SD = standard deviation of i-Ready scores, Adj Mean Diff = adjusted mean difference between
i-Ready Instruction and comparison groups, SE = standard error of the adjusted mean difference, and Effect Size = Hedges’ g.


Summary and Discussion


At all grades, impact analyses suggest that elementary school students who use i-Ready
Instruction with fidelity have higher achievement in reading when compared to students who did
not use i-Ready Instruction. At each grade, students in the i-Ready Instruction group had a
statistically significantly higher reading i-Ready Diagnostic score than did students in a matched
comparison group.


The effect sizes provided additional evidence that i-Ready Instruction is beneficial for improving
student reading. Recent research (Kraft, 2019) suggests education interventions typically attain
effects ranging from 0.03 to 0.17. Our effect sizes for grades 1-5 fell within this range, and the
effect size for Kindergarten exceeded it (0.20). Kraft (2019) notes one should consider various
factors when interpreting effect sizes, including a program’s cost relative to its benefits and the
size of the treatment contrast. For example, we note that i-Ready Instruction is a supplemental
intervention that requires only 12 to 18 weeks of 30 minutes or more per week during a school
year to be considered implemented with fidelity at the student level. In addition, because
i-Ready Instruction is not a full curriculum, students in the i-Ready Instruction group and the
comparison group were likely exposed to much of the same instruction otherwise, so we
believe the contrast between our treatment and comparison groups is likely minimal. Similarly, it
is possible some students in our comparison group were exposed to interventions like i-Ready
Instruction. Thus, given that the required effort for using i-Ready Instruction with fidelity is
relatively low, and the contrast between the i-Ready Instruction and comparison groups is small
compared to a more involved intervention or curricular program, we feel confident that our effect
sizes are meaningful.


Kraft (2019) also points out that the U.S. education system is decentralized, and implementation 
procedures are ultimately controlled by local schools and/or teachers. As a QED, this study did 
not attempt to control for curriculum, supplemental resources, or classroom structure. Students 
in both groups were not participants in a research study; rather, they were actual customers 
and everyday users, and i-Ready Instruction was carried out in real-world conditions. We may 
have found even larger effect sizes had the study been conducted under more controlled 
circumstances. Impacts are typically greater for studies that aim for ideal or close to ideal 
implementation and smaller for studies that examine real-world implementation. Thus, the fact that we 
found statistically significant effects at all grade levels despite the lack of controls is promising. 


We conducted this study differently from a past study using 2017-18 data by considering the 
unit of assignment to be the student instead of the school. Additionally, we used 2018-19 data 
to take advantage of the most recent available information. Despite these key differences, our 
results were highly consistent: both studies found positive, significant results in favor of i-Ready 
Instruction, and both studies had the largest effect sizes at the early grades. This replication 
provides confidence that students using i-Ready Instruction in conjunction with the i-Ready 
Diagnostic show greater reading achievement than a comparison group using the i-Ready 
Diagnostic only. 


Our study was conducted as a rigorous QED to meet the current standards described by the 
WWC (WWC, 2017) to achieve a rating of Meets WWC Group Design Standards with 
Reservations. In addition, because we found statistically significant positive effects for all grades, 
this study meets the guidelines set forth by ESSA for a Level 2 (or Moderate) rating for evidence- 
based research (U.S. Department of Education, 2016). 

Limitations and Implications for Future Studies 


This study provides strong evidence supporting the use of reading i-Ready Instruction for students. 
Through our long-standing relationship with Curriculum Associates and multiple impact 
evaluations, including the current study, we have developed recommendations for the foci of 
future studies that may provide additional evidence to support the impact of i-Ready Instruction. 


First, our ICCs were at or slightly above 0.20 for four of the six grades, suggesting school 
differences may be important for matching and estimating treatment effects. However, the data 
also revealed large variations in how many students at a given school or grade within a school 
used i-Ready Instruction with fidelity. Future studies may look to explore the grade or classroom 
as the unit of assignment. We also recommend Curriculum Associates collect information 
directly from schools to understand their intended implementation so this information can be 
incorporated into sample selection and analytic models. 
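To show why an ICC near 0.20 matters, the sketch below computes the ICC from the two variance components of a two-level model, along with the standard design effect for clustered samples. The variance values and the cluster size of 50 are hypothetical and are not taken from this study.

```python
def icc(between_school_var, within_school_var):
    """Intraclass correlation: share of total variance that lies between schools."""
    return between_school_var / (between_school_var + within_school_var)


def design_effect(icc_value, avg_cluster_size):
    """Standard design effect, 1 + (m - 1) * ICC, for clusters of average size m."""
    return 1 + (avg_cluster_size - 1) * icc_value


# Hypothetical variance components (not values from this report)
tau2 = 500.0     # between-school (level 2) variance
sigma2 = 2000.0  # within-school (level 1) variance

rho = icc(tau2, sigma2)
print(f"ICC = {rho:.2f}")
print(f"Design effect with 50 students per school = {design_effect(rho, 50):.1f}")
```

Under these assumed values, a fifth of the score variance sits between schools, which is why ignoring the school level would overstate the precision of the treatment estimate.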


Second, we note our study was a QED with the typical limitations, including a lack of information 
on implementation decisions made at each school and within each classroom. We recommend 
randomized controlled trials (RCTs) in the future even if only a small sample of schools and 
students is included. We also suggest including only one district to allow greater control over 
implementation. 


Finally, our treatment group was compared to a matched comparison group using the i-Ready 
Diagnostic. It is possible that use of the i-Ready Diagnostic itself increases student achievement. 
However, the design of this study did not allow for an estimation of that impact. Further, using 
i-Ready Diagnostic-only schools and students as the comparison group may have attenuated 
the estimated effects of i-Ready Instruction relative to what would have been observed had the 
treatment group been compared to a “business-as-usual” comparison group. Future studies might 
examine the impact of i-Ready Instruction 
using a set of comparison schools and students not implementing any Curriculum Associates 
products. This would require an external achievement measure, potentially a state assessment, 
as the baseline and outcome measure. 


Quality Control Procedures 


We employed various quality control checks throughout the data cleaning, analysis, and 
reporting processes. HumRRO, Curriculum Associates, and Century Analytics worked together 
to identify a rigorous methodology based on implementation of i-Ready Instruction with fidelity, 
the WWC 4.0 standards, and ESSA Level 2 guidelines. 


Rules for identifying treatment and comparison groups were determined through collaboration 
among the three study partners. Curriculum Associates provided information on the various 
components of i-Ready Instruction and the frequency with which it should be used for 
implementation with fidelity. They also provided i-Ready Diagnostic and Instruction data to allow 
HumRRO and Century Analytics to empirically examine the extent to which these 
recommendations were followed by i-Ready Instruction schools. These discussions led to 
treatment and comparison group criteria in which all partners were confident. 


Data analysis work was completed collaboratively by HumRRO and Century Analytics. Century 
Analytics and HumRRO independently conducted matching and HLM analyses for each grade. 
The researchers reviewed results against each other and worked out any discrepancies. All 
results reported in this study were verified by researchers from both organizations. 

Appendix A. i-Ready Instruction Theory of Action 




The i-Ready Diagnostic is an adaptive assessment that assesses students on relevant skills in a challenging and engaging way, capturing insight about 
student learning down to the subskill level. Teachers are provided with precise, actionable data and instructional recommendations to more seamlessly 
differentiate classroom instruction according to their students’ needs, saving teachers valuable time. This allows teachers to deliver more impactful 
instruction to increase student growth and proficiency. 


The i-Ready Diagnostic is an adaptive assessment that assesses students 
three times a year on relevant skills in a challenging and engaging way, 
capturing insight about individual student math and/or reading strengths 
and needs down to the sub-skill level. 


The following implementation program components help to maximally 
leverage the i-Ready Diagnostic scores for differentiated instruction: 


• Students access their customized i-Ready dashboard to view their data, 
  performance, and progress. 
• Teachers attend Professional Development sessions to acquire i-Ready 
  skills and concepts. 
• Teachers ensure that students’ i-Ready Diagnostic scores are valid and 
  reliable by: 
    ◦ Adequately preparing students before taking the i-Ready Diagnostic 
    ◦ Planning to retest students with abnormal test results (ex: red rush 
      flags) 
    ◦ Monitoring and observing students during the i-Ready Diagnostic 
      administration 
• Teachers access the i-Ready dashboard and reports for: 
    ◦ Precise and actionable performance and growth data 
    ◦ Clear tools such as student can dos and next steps for instruction, 
      including grade placement levels that highlight student needs down 
      to the sub-skill level 
    ◦ Student groups based on similar instructional needs 
    ◦ Typical and Stretch Growth values for each student 
    ◦ Monitoring student growth over time 
• Teachers can display class goals, performance, and progress through data 
  walls or other methods. 
• School and district leaders provide necessary system support: serving as 
  instructional leaders, supporting teachers with implementation, ensuring 
  the required technology is in place, setting appropriate schedules to 
  administer the i-Ready Diagnostic, clearly communicating those 
  administration windows, and accessing reports to view student and class 
  data to better understand resource needs. 

Appendix B. Impact HLM Coefficients 


Table B.1. HLM Results for Kindergarten Reading 


Student-Level Covariates 


Treatment Group Membership 8.73 1.61 5.44 <0.001 5.58 11.87 
Fall 2018 Reading i-Ready Grand Mean Centered 0.74 0.01 59.13 <0.001 0.71 0.76 
Student-Level Stratum 
Female, ELL = 0, SpEd = 0, EcDis = 0 24.77 5.03 4.93 <0.001 14.92 34.63 
Female, ELL = 1, SpEd = 0, EcDis = 0 22.45 5.52 4.07 <0.001 11.64 33.26 
Female, ELL = 0, SpEd = 1, EcDis = 0 11.00 5.63 1.95 0.051 -0.04 22.04 
Female, ELL = 0, SpEd = 0, EcDis = 1 17.02 5.07 3.36 0.001 7.08 26.97 
Female, ELL = 1, SpEd = 1, EcDis = 0 -0.05 10.56 0.00 0.996 -20.74 20.65 
Female, ELL = 0, SpEd = 1, EcDis = 1 7.78 6.02 1.29 0.196 -4.02 19.59 
Female, ELL = 1, SpEd = 0, EcDis = 1 14.66 5.36 2.73 0.006 4.15 25.17 
Female, ELL = 1, SpEd = 1, EcDis = 1 13.51 7.79 1.73 0.083 -1.76 28.78 
Male, ELL = 0, SpEd = 0, EcDis = 0 26.04 5.03 5.17 <0.001 16.18 35.91 
Male, ELL = 1, SpEd = 0, EcDis = 0 14.23 5.52 2.58 0.010 3.40 25.05 
Male, ELL = 0, SpEd = 1, EcDis = 0 13.49 5.24 2.57 0.010 3.21 23.77 
Male, ELL = 0, SpEd = 0, EcDis = 1 18.21 5.10 3.57 <0.001 8.20 28.21 
Male, ELL = 1, SpEd = 1, EcDis = 0 2.15 7.82 0.28 0.783 -13.17 17.48 
Male, ELL = 0, SpEd = 1, EcDis = 1 7.40 5.36 1.38 0.168 -3.11 17.92 
Male, ELL = 1, SpEd = 0, EcDis = 1 13.04 5.37 2.43 0.015 2.50 23.57 
School-Level Covariates 
Charter or Magnet Designation 4.01 2.19 1.83 0.067 -0.28 8.30 
Traditional Elementary School 4.70 1.61 2.91 0.004 1.54 7.86 
Percent non-white students -0.09 0.04 -2.30 0.021 -0.16 -0.01 
Percent FRL students -0.14 0.04 -3.84 <0.001 -0.21 -0.07 
Locale — Suburban -3.23 3.01 -1.07 0.283 -9.12 2.66 
Locale — Rural 1.35 1.69 0.80 0.426 -1.97 4.66 
Locale — City -1.57 3.62 -0.43 0.664 -8.67 5.53 
Intercept 
Intercept 392.56 5.63 69.67 <0.001 381.52 403.61 
Note. Stratum 16 and Locale — Town were used as reference groups in the model. 
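The numeric columns in the Appendix B tables appear to be, in order, the coefficient estimate, its standard error, the t statistic, the p value, and the lower and upper bounds of a 95% confidence interval. That reading is an assumption, since the column headings did not survive in this text. The sketch below checks it against the kindergarten Treatment Group Membership row using a normal-approximation interval; the helper name `wald_summary` is hypothetical.

```python
def wald_summary(estimate, se, z_crit=1.96):
    """Return (t statistic, CI lower bound, CI upper bound) for one coefficient."""
    t_stat = estimate / se
    return t_stat, estimate - z_crit * se, estimate + z_crit * se


# Kindergarten treatment row: estimate 8.73, SE 1.61
# Reported values in the table: t = 5.44, interval 5.58 to 11.87
t_stat, lower, upper = wald_summary(8.73, 1.61)
print(f"t = {t_stat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The recomputed values agree with the reported ones to within a few hundredths. Differences of that size are what one would expect when recomputing from an estimate and standard error that were already rounded to two decimals.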

Table B.2. HLM Results for First Grade Reading 


Student-Level Covariates 


Treatment Group Membership 4.87 0.89 5.47 <0.001 3.12 6.62 
Fall 2018 Reading i-Ready Grand Mean Centered 0.81 0.01 140.47 <0.001 0.80 0.82 
Student-Level Stratum 
Female, ELL = 0, SpEd = 0, EcDis = 0 12.05 2.99 4.03 <0.001 6.19 17.91 
Female, ELL = 1, SpEd = 0, EcDis = 0 9.35 3.27 2.86 0.004 2.94 15.77 
Female, ELL = 0, SpEd = 1, EcDis = 0 -0.20 3.28 -0.06 0.951 -6.63 6.23 
Female, ELL = 0, SpEd = 0, EcDis = 1 6.08 2.99 2.03 0.042 0.22 11.94 
Female, ELL = 1, SpEd = 1, EcDis = 0 -3.19 5.74 -0.55 0.579 -14.44 8.07 
Female, ELL = 0, SpEd = 1, EcDis = 1 -3.04 3.48 -0.88 0.382 -9.85 3.77 
Female, ELL = 1, SpEd = 0, EcDis = 1 7.23 3.13 2.31 0.021 1.10 13.37 
Female, ELL = 1, SpEd = 1, EcDis = 1 -4.83 5.10 -0.95 0.344 -14.84 5.17 
Male, ELL = 0, SpEd = 0, EcDis = 0 10.07 2.99 3.36 0.001 4.20 15.93 
Male, ELL = 1, SpEd = 0, EcDis = 0 8.60 3.31 2.60 0.009 2.11 15.09 
Male, ELL = 0, SpEd = 1, EcDis = 0 -0.22 3.11 -0.07 0.944 -6.32 5.88 
Male, ELL = 0, SpEd = 0, EcDis = 1 5.84 3.00 1.95 0.051 -0.03 11.72 
Male, ELL = 1, SpEd = 1, EcDis = 0 -3.98 4.39 -0.91 0.365 -12.58 4.63 
Male, ELL = 0, SpEd = 1, EcDis = 1 -5.70 3.19 -1.79 0.074 -11.96 0.55 
Male, ELL = 1, SpEd = 0, EcDis = 1 4.02 3.11 1.29 0.196 -2.07 10.11 
School-Level Covariates 
Charter or Magnet Designation -0.92 1.32 -0.70 0.485 -3.51 1.67 
Traditional Elementary School 1.76 0.85 2.06 0.039 0.09 3.44 
Percent non-white students -0.05 0.02 -2.38 0.018 -0.08 -0.01 
Percent FRL students -0.13 0.02 -6.72 <0.001 -0.17 -0.09 
Locale — Suburban 0.66 1.55 0.43 0.671 -2.37 3.69 
Locale — Rural 2.17 0.91 2.38 0.017 0.38 3.95 
Locale — City 1.86 1.82 1.02 0.307 -1.71 5.44 
Intercept 
Intercept 457.68 3.27 139.87 <0.001 451.27 464.09 
Note. Stratum 16 and Locale — Town were used as reference groups in the model. 

Table B.3. HLM Results for Second Grade Reading 


Student-Level Covariates 


Treatment Group Membership 1.92 0.67 2.84 0.005 0.59 3.24 
Fall 2018 Reading i-Ready Grand Mean Centered 0.76 0.00 199.30 <0.001 0.76 0.77 
Student-Level Stratum 
Female, ELL = 0, SpEd = 0, EcDis = 0 23.60 2.46 9.60 <0.001 18.78 28.42 
Female, ELL = 1, SpEd = 0, EcDis = 0 18.98 2.68 7.09 <0.001 13.73 24.23 
Female, ELL = 0, SpEd = 1, EcDis = 0 13.52 2.63 5.13 <0.001 8.35 18.68 
Female, ELL = 0, SpEd = 0, EcDis = 1 21.36 2.46 8.67 <0.001 16.53 26.19 
Female, ELL = 1, SpEd = 1, EcDis = 0 15.39 4.99 3.08 0.002 5.61 25.16 
Female, ELL = 0, SpEd = 1, EcDis = 1 9.08 2.79 3.26 0.001 3.62 14.54 
Female, ELL = 1, SpEd = 0, EcDis = 1 17.90 2.59 6.91 <0.001 12.82 22.97 
Female, ELL = 1, SpEd = 1, EcDis = 1 0.72 3.92 0.18 0.854 -6.97 8.41 
Male, ELL = 0, SpEd = 0, EcDis = 0 23.54 2.46 9.58 <0.001 18.72 28.35 
Male, ELL = 1, SpEd = 0, EcDis = 0 18.77 2.68 7.01 <0.001 13.53 24.02 
Male, ELL = 0, SpEd = 1, EcDis = 0 14.87 2.53 5.87 <0.001 9.90 19.84 
Male, ELL = 0, SpEd = 0, EcDis = 1 20.04 2.47 8.11 <0.001 15.19 24.88 
Male, ELL = 1, SpEd = 1, EcDis = 0 12.86 3.41 3.77 <0.001 6.17 19.56 
Male, ELL = 0, SpEd = 1, EcDis = 1 8.84 2.62 3.37 0.001 3.69 13.98 
Male, ELL = 1, SpEd = 0, EcDis = 1 17.36 2.62 6.62 <0.001 12.22 22.50 
School-Level Covariates 
Charter or Magnet Designation -1.65 1.04 -1.58 0.113 -3.70 0.39 
Traditional Elementary School 0.61 0.63 0.97 0.334 -0.63 1.85 
Percent non-white students -0.02 0.01 -1.53 0.125 -0.05 0.01 
Percent FRL students -0.10 0.01 -6.79 <0.001 -0.12 -0.07 
Locale — Suburban -0.51 1.10 -0.46 0.643 -2.66 1.65 
Locale — Rural 0.16 0.68 0.24 0.810 -1.17 1.50 
Locale — City 1.53 1.26 1.21 0.226 -0.94 3.99 
Intercept 
Intercept 490.11 2.64 185.55 <0.001 484.93 495.29 
Note. Stratum 16 and Locale — Town were used as reference groups in the model. 

Table B.4. HLM Results for Third Grade Reading 


Student-Level Covariates 


Treatment Group Membership 4.36 0.61 7.11 <0.001 3.16 5.56 
Fall 2018 Reading i-Ready Grand Mean Centered 0.80 0.00 222.51 <0.001 0.79 0.80 
Student-Level Stratum 
Female, ELL = 0, SpEd = 0, EcDis = 0 11.72 2.19 5.35 <0.001 7.43 16.02 
Female, ELL = 1, SpEd = 0, EcDis = 0 10.96 2.46 4.46 <0.001 6.15 15.78 
Female, ELL = 0, SpEd = 1, EcDis = 0 5.34 2.36 2.26 0.024 0.72 9.96 
Female, ELL = 0, SpEd = 0, EcDis = 1 9.06 2.18 4.16 <0.001 4.79 13.33 
Female, ELL = 1, SpEd = 1, EcDis = 0 1.17 4.14 0.28 0.776 -6.93 9.28 
Female, ELL = 0, SpEd = 1, EcDis = 1 0.06 2.41 0.03 0.979 -4.66 4.78 
Female, ELL = 1, SpEd = 0, EcDis = 1 8.34 2.28 3.66 <0.001 3.88 12.81 
Female, ELL = 1, SpEd = 1, EcDis = 1 -2.90 3.36 -0.86 0.388 -9.48 3.68 
Male, ELL = 0, SpEd = 0, EcDis = 0 12.60 2.19 5.76 <0.001 8.31 16.90 
Male, ELL = 1, SpEd = 0, EcDis = 0 9.24 2.42 3.81 <0.001 4.49 13.99 
Male, ELL = 0, SpEd = 1, EcDis = 0 5.87 2.26 2.60 0.009 1.44 10.30 
Male, ELL = 0, SpEd = 0, EcDis = 1 9.06 2.18 4.15 <0.001 4.78 13.34 
Male, ELL = 1, SpEd = 1, EcDis = 0 6.01 3.45 1.74 0.082 -0.75 12.78 
Male, ELL = 0, SpEd = 1, EcDis = 1 0.59 2.30 0.26 0.798 -3.92 5.09 
Male, ELL = 1, SpEd = 0, EcDis = 1 6.33 2.28 2.77 0.006 1.86 10.80 

School-Level Covariates 
Charter or Magnet Designation 1.89 0.94 2.01 0.044 0.05 3.74 
Traditional Elementary School 0.09 0.59 0.15 0.882 -1.06 1.24 
Percent non-white students -0.05 0.01 -3.79 <0.001 -0.08 -0.02 
Percent FRL students -0.09 0.01 -6.68 <0.001 -0.11 -0.06 
Locale — Suburban 0.76 1.01 0.76 0.449 -1.22 2.74 
Locale — Rural 0.99 0.61 1.62 0.105 -0.21 2.19 
Locale — City -1.45 1.18 -1.23 0.217 -3.76 0.85 
Intercept 
Intercept 529.17 2.36 224.29 <0.001 524.55 533.80 
Note. Stratum 16 and Locale — Town were used as reference groups in the model. 

Table B.5. HLM Results for Fourth Grade Reading 


Student-Level Covariates 


Treatment Group Membership 3.46 0.58 6.02 <0.001 2.34 4.59 
Fall 2018 Reading i-Ready Grand Mean Centered 0.80 0.00 232.65 <0.001 0.79 0.81 
Student-Level Stratum 
Female, ELL = 0, SpEd = 0, EcDis = 0 9.31 1.89 4.94 <0.001 5.62 13.01 
Female, ELL = 1, SpEd = 0, EcDis = 0 5.44 2.21 2.46 0.014 1.11 9.76 
Female, ELL = 0, SpEd = 1, EcDis = 0 1.82 2.06 0.88 0.378 -2.23 5.86 
Female, ELL = 0, SpEd = 0, EcDis = 1 6.65 1.87 3.56 <0.001 2.98 10.32 
Female, ELL = 1, SpEd = 1, EcDis = 0 -6.15 4.42 -1.39 0.165 -14.82 2.52 
Female, ELL = 0, SpEd = 1, EcDis = 1 -0.87 2.07 -0.42 0.673 -4.93 3.18 
Female, ELL = 1, SpEd = 0, EcDis = 1 4.54 1.98 2.29 0.022 0.65 8.43 
Female, ELL = 1, SpEd = 1, EcDis = 1 -8.00 2.98 -2.68 0.007 -13.85 -2.16 
Male, ELL = 0, SpEd = 0, EcDis = 0 10.55 1.88 5.60 <0.001 6.86 14.24 
Male, ELL = 1, SpEd = 0, EcDis = 0 6.30 2.18 2.90 0.004 2.04 10.57 
Male, ELL = 0, SpEd = 1, EcDis = 0 2.73 1.97 1.39 0.165 -1.12 6.59 
Male, ELL = 0, SpEd = 0, EcDis = 1 6.82 1.87 3.64 <0.001 3.14 10.49 
Male, ELL = 1, SpEd = 1, EcDis = 0 -3.78 3.57 -1.06 0.291 -10.78 3.23 
Male, ELL = 0, SpEd = 1, EcDis = 1 1.29 2.01 0.64 0.521 -2.65 5.22 
Male, ELL = 1, SpEd = 0, EcDis = 1 4.37 1.99 2.20 0.028 0.47 8.27 

School-Level Covariates 
Charter or Magnet Designation -0.86 0.96 -0.90 0.371 -2.74 1.02 
Traditional Elementary School -1.17 0.58 -2.01 0.044 -2.31 -0.03 
Percent non-white students -0.01 0.01 -1.01 0.313 -0.04 0.01 
Percent FRL students -0.07 0.01 -5.86 <0.001 -0.10 -0.05 
Locale — Suburban 0.38 1.00 0.38 0.708 -1.59 2.35 
Locale — Rural 1.48 0.61 2.42 0.016 0.28 2.67 
Locale — City 0.14 1.15 0.12 0.902 -2.12 2.40 
Intercept 
Intercept 553.50 2.07 267.14 <0.001 549.44 557.56 
Note. Stratum 16 and Locale — Town were used as reference groups in the model. 

Table B.6. HLM Results for Fifth Grade Reading 


Student-Level Covariates 


Treatment Group Membership 4.95 0.56 8.78 <0.001 3.84 6.05 
Fall 2018 Reading i-Ready Grand Mean Centered 0.82 0.00 248.71 <0.001 0.82 0.83 
Student-Level Stratum 
Female, ELL = 0, SpEd = 0, EcDis = 0 16.02 1.67 9.60 <0.001 12.75 19.29 
Female, ELL = 1, SpEd = 0, EcDis = 0 15.13 2.17 6.96 <0.001 10.87 19.38 
Female, ELL = 0, SpEd = 1, EcDis = 0 7.94 1.87 4.26 <0.001 4.28 11.60 
Female, ELL = 0, SpEd = 0, EcDis = 1 13.44 1.65 8.14 <0.001 10.20 16.68 
Female, ELL = 1, SpEd = 1, EcDis = 0 9.49 4.17 2.28 0.023 1.32 17.67 
Female, ELL = 0, SpEd = 1, EcDis = 1 8.99 1.90 4.72 <0.001 5.25 12.72 
Female, ELL = 1, SpEd = 0, EcDis = 1 10.01 1.80 5.55 <0.001 6.47 13.54 
Female, ELL = 1, SpEd = 1, EcDis = 1 4.61 2.73 1.69 0.091 -0.74 9.96 
Male, ELL = 0, SpEd = 0, EcDis = 0 15.69 1.67 9.41 <0.001 12.42 18.96 
Male, ELL = 1, SpEd = 0, EcDis = 0 14.63 2.06 7.11 <0.001 10.60 18.67 
Male, ELL = 0, SpEd = 1, EcDis = 0 9.63 1.77 5.43 <0.001 6.15 13.11 
Male, ELL = 0, SpEd = 0, EcDis = 1 13.44 1.65 8.13 <0.001 10.21 16.68 
Male, ELL = 1, SpEd = 1, EcDis = 0 6.58 3.19 2.06 0.039 0.33 12.84 
Male, ELL = 0, SpEd = 1, EcDis = 1 4.31 1.79 2.42 0.016 0.81 7.81 
Male, ELL = 1, SpEd = 0, EcDis = 1 8.38 1.80 4.66 <0.001 4.86 11.90 
School-Level Covariates 
Charter or Magnet Designation -0.34 0.93 -0.37 0.712 -2.17 1.48 
Traditional Elementary School -0.09 0.57 -0.16 0.870 -1.22 1.03 
Percent non-white students 0.02 0.01 1.62 0.105 0.00 0.05 
Percent FRL students -0.09 0.01 -7.04 <0.001 -0.11 -0.06 
Locale — Suburban 1.61 0.98 1.63 0.102 -0.32 3.53 
Locale — Rural 0.51 0.60 0.85 0.394 -0.66 1.68 
Locale — City -0.57 1.17 -0.48 0.628 -2.86 1.72 
Intercept 
Intercept 566.35 1.86 303.78 <0.001 562.69 570.00 
Note. Stratum 16 and Locale — Town were used as reference groups in the model. 

Appendix C. Model Assumption Checks 


We examined three model assumptions associated with two-level HLM—residual normality, 
independence, and homoscedasticity—using the MIXED_DX macro in SAS (Bell, Smiley, Ene, 
& Blue, 2014) based on the analytic model for all six grade levels of this study. The MIXED_DX 
macro provides visual output including box-and-whisker plots, histograms, scatter plots, and 
summary tables to examine residual normality, linearity, homoscedasticity, and influential 
outliers. The macro provides this information for level 1 and level 2 residuals. 


We reviewed plots and summary tables at level 1 and level 2 for each grade level. These 
checks provided assurance that our analytic model was appropriate for our data. We examined 
histograms, box-and-whisker plots, and scatter plots to check residual normality. These plots 
supported that our residuals were generally normally distributed. In particular, the histograms of 
level 2 residuals showed a highly symmetrical bell shape with little skewness or kurtosis. The 
level 1 residuals had some skewness but were close enough to normal to allow confidence. 
There was no evidence when examining level 1 residuals of clearly non-normal distributions 
such as a bimodal distribution. Violations of the normality assumption for level 1 residuals can 
adversely affect estimation of random effect coefficients and variance-covariance components, 
but typically will not adversely affect estimation of standard errors and, therefore, inferences 
regarding statistical significance. Given the primary purpose of the models was estimating 
treatment effects, the slight lack of normality of the level 1 residuals likely did not have 
implications for the findings presented in this report. 


Scatter plots of predicted values against residuals at level 1 and level 2 clearly illustrated 
random distributions and supported that the assumptions of independence and 
homoscedasticity were not violated. 
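The visual checks described above can be complemented with numeric summaries of residual shape. The sketch below is not the MIXED_DX macro and uses simulated residuals rather than study data. It computes moment-based skewness and excess kurtosis, both of which are close to zero for normally distributed residuals.

```python
import random


def _central_moment(xs, order):
    """Population central moment of the given order."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** order for x in xs) / len(xs)


def skewness(xs):
    """Moment-based skewness; 0 for a perfectly symmetric sample."""
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Moment-based kurtosis minus 3; about 0 for normal data."""
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3.0


# Simulated level 1 residuals (hypothetical; SD of 30 chosen arbitrarily)
random.seed(0)
residuals = [random.gauss(0.0, 30.0) for _ in range(5000)]

print(f"skewness = {skewness(residuals):.3f}")
print(f"excess kurtosis = {excess_kurtosis(residuals):.3f}")
```

In practice the same two functions would be applied to the level 1 residuals and to the level 2 (school) random-effect estimates exported from the fitted model.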